On the mutual nearest neighbors estimate in regression
Abstract
Motivated by promising experimental results, this paper investigates the theoretical properties of a recently proposed nonparametric estimator, called the Mutual Nearest Neighbors rule, which estimates the regression function m(x) = E[Y | X = x] as follows: first identify the k nearest neighbors of x in the sample Dn, then keep only those for which x is itself one of the k nearest neighbors, and finally take the average over the corresponding response variables. We prove that this estimator is consistent and that its rate of convergence is optimal. Since the estimate with the optimal rate of convergence depends on the unknown distribution of the observations, we also present adaptation results by data-splitting.
Index Terms: Nonparametric estimation, Nearest neighbor methods, Mathematical statistics.
2010 Mathematics Subject Classification: 62C10, 62F15, 62G20.
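The three-step rule described in the abstract can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' code: the function names `mnn_regress` and `choose_k_by_split`, the Euclidean distance, the tie handling, and the `fallback` value returned when no mutual neighbor exists are all assumptions made for the example. The data-splitting helper shows one common way to pick k on held-out data and may differ from the procedure analyzed in the paper.

```python
import numpy as np


def mnn_regress(x, X, Y, k, fallback=0.0):
    """Mutual k-nearest-neighbors estimate of m(x) = E[Y | X = x].

    Hypothetical helper: `fallback` (an assumption) is returned when
    no sample point is a mutual neighbor of x.
    """
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    if X.ndim == 1:
        X = X[:, None]          # treat a 1-D sample as n points in R^1
    x = np.asarray(x, dtype=float).reshape(-1)

    # Step 1: the k nearest neighbors of x in the sample.
    d_x = np.linalg.norm(X - x, axis=1)
    knn = np.argsort(d_x, kind="stable")[:k]

    # Step 2: keep X_i only if x is among the k nearest neighbors of X_i,
    # i.e. fewer than k other sample points are strictly closer to X_i.
    mutual = []
    for i in knn:
        d_i = np.linalg.norm(X - X[i], axis=1)
        d_i[i] = np.inf         # X_i is not its own neighbor
        if np.sum(d_i < d_x[i]) < k:
            mutual.append(i)

    # Step 3: average the responses of the mutual neighbors.
    if not mutual:
        return fallback
    return float(np.mean(Y[mutual]))


def choose_k_by_split(X, Y, candidates, n_train):
    """Pick k by data-splitting (a sketch under stated assumptions).

    The first `n_train` observations build the estimate; the remaining
    ones measure its empirical squared error for each candidate k.
    """
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    X_tr, Y_tr = X[:n_train], Y[:n_train]
    X_te, Y_te = X[n_train:], Y[n_train:]

    def risk(k):
        preds = np.array([mnn_regress(x, X_tr, Y_tr, k) for x in X_te])
        return float(np.mean((preds - Y_te) ** 2))

    return min(candidates, key=risk)
```

For example, with sample points 0, 0.1, 0.2 and 5 and a query at x = 3 with k = 2, the ordinary 2-nearest-neighbor rule would average the responses at 5 and 0.2, whereas the mutual rule keeps only the point at 5, because 0.2 has two sample points closer to it than x.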
Similar resources
Bayesian Kernel and Mutual $k$-Nearest Neighbor Regression
We propose Bayesian extensions of two nonparametric regression methods: kernel regression and mutual k-nearest neighbor regression. Derived from Gaussian process models for regression, the extensions provide distributions for target value estimates and a framework for selecting the hyperparameters. It is shown that both the proposed methods asymptotically converge to kernel and mutual k...
Mutual Information and Conditional Mean Prediction Error
Mutual information is fundamentally important for measuring statistical dependence between variables and for quantifying information transfer by signaling and communication mechanisms. It can, however, be challenging to evaluate for physical models of such mechanisms and to estimate reliably from data. Furthermore, its relationship to better known statistical procedures is still poorly understo...
A Novel Hybrid Approach for Email Spam Detection based on Scatter Search Algorithm and K-Nearest Neighbors
Because cyberspace and the Internet dominate users' lives, they bring not only business opportunities and time savings but also hardware and software threats such as information theft and system penetration. Security is the top priority in preventing cyber-attacks, and users should first detect the type of attack, because virtual environments are not moni...
ERAF: A R Package for Regression and Forecasting
We present a package for the R language containing a set of tools for regression using ensembles of learning machines and for time series forecasting. The package contains implementations of Bagging and Adaboost for regression, and algorithms for computing mutual information, autocorrelation and false nearest neighbors.
Estimation of Density using Plotless Density Estimator Criteria in Arasbaran Forest
Sampling methods have a theoretical basis and should be operational in different forests; therefore selecting an appropriate sampling method is effective for accurate estimation of forest characteristics. The purpose of this study was to estimate the stand density (number per hectare) in Arasbaran forest using a variety of the plotless density estimators of the nearest neighbors sampling me...
Journal: Journal of Machine Learning Research
Volume: 14
Publication date: 2013